675 research outputs found

    The SkipTrie: low-depth concurrent search without rebalancing

    Get PDF
    To date, all concurrent search structures that can support predecessor queries have had depth logarithmic in m, the number of elements. This paper introduces the SkipTrie, a new concurrent search structure supporting predecessor queries in amortized expected O(log log u + c) steps, insertions and deletions in O(c log log u), and using O(m) space, where u is the size of the key space and c is the contention during the recent past. The SkipTrie is a probabilistically-balanced version of a y-fast trie consisting of a very shallow skiplist from which randomly chosen elements are inserted into a hash-table based x-fast trie. By inserting keys into the x-fast-trie probabilistically, we eliminate the need for rebalancing, and can provide a lock-free linearizable implementation. To the best of our knowledge, our proof of the amortized expected performance of the SkipTrie is the first such proof for a tree-based data structure.National Science Foundation (U.S.) (Grant CCF-1217921)United States. Dept. of Energy. Office of Advanced Scientific Computing Research (Grant ER26116/DE-SC0008923)Oracle CorporationIntel Corporatio

    Towards consistency oblivious programming

    Get PDF
    15th International Conference, OPODIS 2011, Toulouse, France, December 13-16, 2011. ProceedingsIt is well known that guaranteeing program consistency when accessing shared data comes at the price of degraded performance and scalability. This paper initiates the investigation of consistency oblivious programming (COP). In COP, sections of concurrent code that meet certain criteria are executed without checking for consistency. However, checkpoints are added before any shared data modification to verify the algorithm was on the right track, and if not, it is re-executed in a more conservative and expensive consistent way. We show empirically that the COP approach can enhance a software transactional memory (STM) framework to deliver more efficient concurrent data structures from serial source code. In some cases the COP code delivers performance comparable to that of more complex fine-grained structures

    Admit your weakness: Verifying correctness on TSO architectures

    Get PDF
    “The final publication is available at http://link.springer.com/chapter/10.1007%2F978-3-319-15317-9_22 ”.Linearizability has become the standard correctness criterion for fine-grained non-atomic concurrent algorithms, however, most approaches assume a sequentially consistent memory model, which is not always realised in practice. In this paper we study the correctness of concurrent algorithms on a weak memory model: the TSO (Total Store Order) memory model, which is commonly implemented by multicore architectures. Here, linearizability is often too strict, and hence, we prove a weaker criterion, quiescent consistency instead. Like linearizability, quiescent consistency is compositional making it an ideal correctness criterion in a component-based context. We demonstrate how to model a typical concurrent algorithm, seqlock, and prove it quiescent consistent using a simulation-based approach. Previous approaches to proving correctness on TSO architectures have been based on linearizabilty which makes it necessary to modify the algorithm’s high-level requirements. Our approach is the first, to our knowledge, for proving correctness without the need for such a modification

    Lock cohorting: A general technique for designing NUMA locks

    Get PDF
    Multicore machines are quickly shifting to NUMA and CC-NUMA architectures, making scalable NUMA-aware locking algorithms, ones that take into account the machines' non-uniform memory and caching hierarchy, ever more important. This paper presents lock cohorting, a general new technique for designing NUMA-aware locks that is as simple as it is powerful. Lock cohorting allows one to transform any spin-lock algorithm, with minimal non-intrusive changes, into scalable NUMA-aware spin-locks. Our new cohorting technique allows us to easily create NUMA-aware versions of the TATAS-Backoff, CLH, MCS, and ticket locks, to name a few. Moreover, it allows us to derive a CLH-based cohort abortable lock, the first NUMA-aware queue lock to support abortability. We empirically compared the performance of cohort locks with prior NUMA-aware and classic NUMA-oblivious locks on a synthetic micro-benchmark, a real world key-value store application memcached, as well as the libc memory allocator. Our results demonstrate that cohort locks perform as well or better than known locks when the load is low and significantly out-perform them as the load increases

    Deriving the Golden Ratio using a Scientific Calculator

    Get PDF
    In this article, we illustrate the use of a scientific calculator for exploring the Fibonacci sequence and the Golden Ratio. The calculator, equipped with a feature for setting up iterations, enables the process of creating a sequence which rapidly converges to the Golden Ratio

    ThreadScan: Automatic and Scalable Memory Reclamation

    Get PDF
    The concurrent memory reclamation problem is that of devising a way for a deallocating thread to verify that no other concurrent threads hold references to a memory block being deallocated. To date, in the absence of automatic garbage collection, there is no satisfactory solution to this problem. Existing tracking methods like hazard pointers, reference counters, or epoch-based techniques like RCU, are either prohibitively expensive or require significant programming expertise, to the extent that implementing them efficiently can be worthy of a publication. None of the existing techniques are automatic or even semi-automated. In this paper, we take a new approach to concurrent memory reclamation: instead of manually tracking access to memory locations as done in techniques like hazard pointers, or restricting shared accesses to specific epoch boundaries as in RCU, our algorithm, called ThreadScan, leverages operating system signaling to automatically detect which memory locations are being accessed by concurrent threads. Initial empirical evidence shows that ThreadScan scales surprisingly well and requires negligible programming effort beyond the standard use of Malloc and Free

    Correctness of an STM Haskell implementation

    Get PDF
    A concurrent implementation of software transactional memory in Concurrent Haskell using a call-by-need functional language with processes and futures is given. The description of the small-step operational semantics is precise and explicit, and employs an early abort of conflicting transactions. A proof of correctness of the implementation is given for a contextual semantics with may- and should-convergence. This implies that our implementation is a correct evaluator for an abstract specification equipped with a big-step semantics

    Training-Free Uncertainty Estimation for Dense Regression: Sensitivity as a Surrogate

    Full text link
    Uncertainty estimation is an essential step in the evaluation of the robustness for deep learning models in computer vision, especially when applied in risk-sensitive areas. However, most state-of-the-art deep learning models either fail to obtain uncertainty estimation or need significant modification (e.g., formulating a proper Bayesian treatment) to obtain it. Most previous methods are not able to take an arbitrary model off the shelf and generate uncertainty estimation without retraining or redesigning it. To address this gap, we perform a systematic exploration into training-free uncertainty estimation for dense regression, an unrecognized yet important problem, and provide a theoretical construction justifying such estimations. We propose three simple and scalable methods to analyze the variance of outputs from a trained network under tolerable perturbations: infer-transformation, infer-noise, and infer-dropout. They operate solely during inference, without the need to re-train, re-design, or fine-tune the model, as typically required by state-of-the-art uncertainty estimation methods. Surprisingly, even without involving such perturbations in training, our methods produce comparable or even better uncertainty estimation when compared to training-required state-of-the-art methods.Comment: 18 pages, 13 figure

    Quiescent consistency: Defining and verifying relaxed linearizability

    Get PDF
    Concurrent data structures like stacks, sets or queues need to be highly optimized to provide large degrees of parallelism with reduced contention. Linearizability, a key consistency condition for concurrent objects, sometimes limits the potential for optimization. Hence algorithm designers have started to build concurrent data structures that are not linearizable but only satisfy relaxed consistency requirements. In this paper, we study quiescent consistency as proposed by Shavit and Herlihy, which is one such relaxed condition. More precisely, we give the first formal definition of quiescent consistency, investigate its relationship with linearizability, and provide a proof technique for it based on (coupled) simulations. We demonstrate our proof technique by verifying quiescent consistency of a (non-linearizable) FIFO queue built using a diffraction tree. © 2014 Springer International Publishing Switzerland
    corecore